CURRENT_MEETING_REPORT_
Reported by Jeffrey Mogul/DEC
AGENDA
(a) Report on current draft (McCloghrie/Fox/Mogul)
(b) Review other alternatives
(c) Review goals and assumptions
(d) Obtain consensus on approach
(e) Focus on details
(f) What next?
MINUTES
This was the second meeting of the MTU Discovery Working Group.
We started with a quick presentation by Keith McCloghrie of the draft
that he and Rich Fox wrote based on the apparent consensus of the
December meeting. Some attendees had not read the draft, and we tried
to ensure that everyone understood the basic outline. [Summary:
senders occasionally attach an IP PMTU-Query Option to their datagrams.
Routers update the PMTU value in the option; the last-hop router returns
the PMTU to the sender using the ICMP Path-MTU message. If the
destination host detects a change in the MTU (when a fragment is
received), it sends an ICMP Unexpected Fragment Report message.]
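[Illustration, not part of the draft: the router behavior summarized above
amounts to taking a running minimum over the link MTUs along the path. The
sketch below assumes that reading; the function names and the list-of-MTUs
path model are hypothetical, and it does not reproduce the draft's option
format.]

```python
# Illustrative sketch only. Names and the path model are assumptions,
# not taken from the McCloghrie/Fox draft.

def update_query_option(option_pmtu, outgoing_link_mtu):
    """Router step: lower the PMTU carried in the option if the
    outgoing link has a smaller MTU; otherwise leave it unchanged."""
    return min(option_pmtu, outgoing_link_mtu)

def discover_pmtu(first_hop_mtu, later_link_mtus):
    """Carry the option across each later link and return the value the
    last-hop router would report back in an ICMP Path-MTU message."""
    pmtu = first_hop_mtu
    for link_mtu in later_link_mtus:
        pmtu = update_query_option(pmtu, link_mtu)
    return pmtu

# Hypothetical path: sender on a 1500-byte link, then three more links.
print(discover_pmtu(1500, [4352, 1006, 576]))
```

[With the hypothetical path shown, the reported value is the smallest link
MTU, 576.]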
We also reviewed the "Steve Deering" proposal from last year, as there
was a realization that it might not be dead, after all. Among other
things, we now know that there are not 1 but 4 spare bits in the IP
header (there are 3 unused in the TOS field), and that the powers that
be might therefore be likely to let us use one. [Summary of Deering
proposal: senders often send datagrams with "RF" (Report Fragmentation)
bit set in the IP header. A host receiving fragment-0 of a datagram
with RF set sends an ICMP Fragmentation Occurred message.]
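[Illustration, not part of the proposal: the receiver-side rule in the
Deering summary can be sketched as a single test on an arriving datagram.
The field names below are hypothetical, and "fragment-0" is assumed to mean
offset zero with the more-fragments flag set.]

```python
# Illustrative sketch only. Field names are assumptions; the RF bit is
# the proposed Report Fragmentation bit, not an existing header field.
from dataclasses import dataclass

@dataclass
class Datagram:
    rf: bool              # proposed Report Fragmentation bit
    more_fragments: bool  # IP "more fragments" flag
    frag_offset: int      # fragment offset field
    total_length: int     # length of this datagram or fragment in bytes

def is_fragment_zero(d):
    """First piece of a fragmented datagram: offset 0, more to follow."""
    return d.frag_offset == 0 and d.more_fragments

def should_send_fragmentation_occurred(d):
    """Receiver rule: report only for fragment-0 of an RF datagram."""
    return d.rf and is_fragment_zero(d)

whole = Datagram(rf=True, more_fragments=False, frag_offset=0, total_length=1500)
first = Datagram(rf=True, more_fragments=True, frag_offset=0, total_length=1004)
later = Datagram(rf=True, more_fragments=False, frag_offset=123, total_length=516)
print([should_send_fragmentation_occurred(d) for d in (whole, first, later)])
```

[Only the middle case triggers a report, so the receiver keeps no
per-sender state to make the decision and sends at most one ICMP per
fragmented datagram.]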
We then started a fairly unstructured discussion comparing the costs and
benefits of the two approaches.
1. Lifetime of protocol: on the one hand, in principle MTU discovery
should be obviated by the coming revolution in routing protocols.
Within "a few" years, the routing protocols will provide path-MTU
information, so MTU discovery will be unnecessary. Of course, we
all know about things that are supposed to happen "real soon now";
we particularly all know about relatively new things that
"everyone" implements. Still, while avoiding the trap of assuming
that the world will be perfect in just a couple of years, it may
not be worth trying to solve the problem of MTU discovery for all
time, since it may not be useful for that long.
2. Rapidity of deployment: Clearly, MTU discovery of any form only
works for a sender if some subset of the other nodes (routers
and/or destinations) support it. Query-based schemes depend upon
support from a large fraction of the routers; RF-style schemes only
help if a large fraction of the end-hosts support it. There was
some debate about which population is more likely to upgrade soon
(routers or end-hosts). No consensus was reached.
3. Connection lifetimes: Van's data suggest that most non-local TCP
connections are short (ca. 4 datagrams). This makes some sense
(mostly SMTP) although this is only one sample point, and we agreed
that more data would be useful. Van argued that this works against
a query-based scheme, since by the time one has useful information,
there's not much left to do with it. His argument in favor of the
RF scheme was that the right way to use it is to assume that you
can send large datagrams (sized by your first-hop MTU, or perhaps
some estimate of the NSFNET PMTU, ca. 1500), and let the
destination tell you if you are screwing up.
In general, we realize that fragmentation is not inherently evil.
Although it might create some extra overhead for the routers, what
we really have to avoid is the "deterministic fragment loss"
problem which causes connections to stall. Thus, (I hope I am
correctly paraphrasing Van's argument) it is only worth doing for
connections that last a while, either because they are carrying
lots of data, or because they are stalled due to fragment loss.
Query-based schemes waste router resources because processing IP
options is expensive, and the payoff is unlikely.
It was argued that, since the senders cache the MTU values learned
by either scheme in the per-host routing entries, querying would
not have to be done on every connection to be useful. Again, Van
drew on his traffic studies to suggest that (even over a 12-hour
period) there was generally little correlation between connections
... that is, just because one pair of hosts makes a connection
does not mean that they will do so any time soon. Some of us did
not believe that is necessarily true (for example, how much traffic
comes from mail-hub machines like DECWRL and UUNET?) Again, we
agreed that it would be nice to have more traffic data available.
4. Complexity: Now that the draft specification for the query-based
scheme is done, we realized that it is a lot more complex than we
thought. One problem is the number of tunable parameters. Since
the RF scheme doesn't require the receiver to maintain any state
about the sender [actually, this is not quite true, as noted
later], doesn't require the sender to schedule when to send the
option, doesn't cause the receiver to send notifications when
intentional fragmentation occurs [NFS would probably not set RF],
and it requires no support at all from the routers, it appears to
be simpler [but keep reading].
After this discussion, it was pretty clear that the consensus had
shifted to trying to use the RF scheme. We made the assumption that we
could get a header bit (Van argued that although the RF scheme could be
done using an option, the cost/benefit analysis might be against it).
The next step was to explore how well that would really work.
One problem that came up right away is that James VanBokkelen believes
there to exist many PC-based systems that (1) do not reassemble
fragments, and (2) do advertise MSS values of 1500 to non-local peers.
Currently, these hosts function because the 576-if-nonlocal rule
observed by most non-PC hosts means that, given today's Internet, even
when they advertise an MSS of 1500 to a non-local host, the host at the
other end will not send datagrams big enough to be fragmented. [I
suppose it is unlikely for two PCs to talk to each other over long
distances.] However, if we use the simplest RF scheme, these hosts are
going to get fragmented datagrams. Since we assume that any host which
implements MTU discovery is also in conformance with the other rules
(specifically, fragmentation reassembly), we therefore know that such
sub-standard PCs won't send the ICMP Fragmentation Occurred message, and
these connections would stall.
The obvious fix is to not invoke MTU discovery (i.e., not send segments
> 576 bytes) unless you are sure that the other end supports it. This
means that you have to have seen a datagram with RF set coming back to
you from the destination before you can send large datagrams.
More subtly, since we don't want to mislead these stupid PCs (which
apparently don't follow the 576-byte rule in either direction) you
cannot even send an MSS > 576 to a non-local peer until you have seen an
RF bit from it. Thus, since the TCP MSS option can only be sent on the
SYN datagram, a host initiating a TCP connection may not be able to use
MTU discovery (and large segments) unless it has talked with the other
end recently. (The second host is in a better position; since it sees
the RF bit before it has to send its own MSS option, it can set a large
MSS immediately. This is nice for FTP retrieves; it doesn't help for
SMTP, alas).
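[Illustration, not agreed text: the handshake rule in the last two
paragraphs can be sketched as a small per-peer cache. The class and method
names are hypothetical. Following the minutes, 576 is used as the limit for
both the advertised MSS and the datagrams sent, ignoring header overhead.]

```python
# Illustrative sketch only. Names are assumptions; 576 is the
# "576-if-nonlocal" figure from the minutes, applied loosely to both
# the advertised MSS and the size of datagrams sent.
NONLOCAL_LIMIT = 576

class RfPeerCache:
    def __init__(self):
        self._rf_peers = set()  # peers from which an RF bit has been seen

    def saw_datagram(self, peer, rf_set):
        """Record that a datagram arrived from peer, noting its RF bit."""
        if rf_set:
            self._rf_peers.add(peer)

    def size_limit(self, peer, first_hop_limit, peer_is_local=False):
        """Largest size to advertise or send to this peer right now."""
        if peer_is_local or peer in self._rf_peers:
            return first_hop_limit
        return min(first_hop_limit, NONLOCAL_LIMIT)

# Host A opens a connection to host B; neither has talked to the other.
a_cache, b_cache = RfPeerCache(), RfPeerCache()
print(a_cache.size_limit("B", 1500))      # A must choose before hearing from B
b_cache.saw_datagram("A", rf_set=True)    # B receives A's SYN with RF set
print(b_cache.size_limit("A", 1500))      # B can answer with a large value
```

[The first host is held to 576 and the second is not, which is the
asymmetry described above. Because the cache is per-peer rather than
per-connection, a later connection from A to B could start large once A has
seen an RF bit from B.]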
The consensus was that this limitation was acceptable, since it erred on
the conservative side. (Although it errs in the case of the most common
connection-type [SMTP], since SMTP connections are normally short we
wouldn't gain much anyway.) When two connections are made in quick
succession, things work nicely (e.g., several mail messages, or the
control connection of an FTP session followed by the data connection.
The control connection will seldom carry large segments, but the
exchange of RF bits done then will allow the data connection to use
large segments right away.)
Mike Karels proposed (off-the-cuff, not necessarily believing that it
was right) that routers fragmenting a datagram with RF set could also
send the fragmentation-occurred ICMP. This seemed to create problems
given the requirement for handshaking imposed by the broken-PC crowd, so
Mike agreed to go off and think about this one.
One question arose about the use of a previously unused bit in the IP
header: what would current implementations do if they see it set? (We
know that we can safely add options, since by definition these are
ignored if not known.) While the IP spec says these bits must be zero,
the "robustness principle" implies that routers and hosts should ignore
them. Unfortunately, John Moy from Proteon admitted that Proteon
routers drop such datagrams, and Noel Chiappa says that this is true of
other implementations based on his old MIT "C-gateway" code. We have to
find out just how bad this is going to be; perhaps Proteon will be able
to upgrade all of its customers before MTU discovery is widely
implemented.
[Side note: Clearly, implementations contrary to the basic IP spec are
causing us serious grief. How much do we twist the protocol to
accommodate them?]
An orthogonal issue is that in high-speed long-distance networks, there
might be lots of packets in flight when the route changes to one with a
lower MTU (e.g., on a satellite link with a half-second RTT, 4 Kbyte
packets, and a 100 Mbit/sec channel, this means 1500 packets per RTT!)
Since the source cannot react to a Fragmentation Occurred message sooner than
one RTT worth of packets after the one that triggered the message, we
are concerned that setting the RF bit on every packet could lead to
positive (i.e., anti-stability) feedback in a network that is losing
capacity.
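[Illustration: the packets-per-RTT figure in the example is the
bandwidth-delay product divided by the packet size. The sketch below checks
the arithmetic, assuming the packet size quoted means 4 kilobytes; the
function name is hypothetical.]

```python
# Illustrative arithmetic check. The function name is an assumption.

def packets_in_flight(rate_bits_per_sec, rtt_sec, packet_bytes):
    """Bandwidth-delay product expressed in packets."""
    bits_in_flight = rate_bits_per_sec * rtt_sec
    return bits_in_flight / (packet_bytes * 8)

# 100 Mbit/sec channel, half-second RTT, 4096-byte or 4000-byte packets.
print(packets_in_flight(100e6, 0.5, 4096))
print(packets_in_flight(100e6, 0.5, 4000))
```

[Either reading of the packet size gives a little over 1500 packets, so up
to that many RF-marked datagrams could already be in the network when the
first report reaches the source.]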
This could be attacked in two ways: limit the rate at which the RF bit
is sent, or limit the rate at which the ICMP is sent. The former could
be done "once per RTT", once per some constant time period, or perhaps
once per window. It's not clear if there is a convenient way of marking
out the boundaries between windows.
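[Illustration, not a decision of the group: the first option, limiting how
often the sender sets RF, could look like the sketch below. The class name
is hypothetical, and the interval is whatever the sender picks (an RTT
estimate or a constant); the group did not settle on either.]

```python
# Illustrative sketch only. The class name and the choice of interval
# are assumptions; the minutes leave the pacing rule open.

class RfPacer:
    """Allow the RF bit on at most one datagram per interval."""

    def __init__(self, interval_sec):
        self.interval_sec = interval_sec
        self._last_rf_time = None  # time RF was last set, if ever

    def should_set_rf(self, now_sec):
        """Return True if the datagram sent at now_sec should carry RF."""
        if self._last_rf_time is None or now_sec - self._last_rf_time >= self.interval_sec:
            self._last_rf_time = now_sec
            return True
        return False

pacer = RfPacer(interval_sec=0.5)  # e.g. the half-second RTT from the example
send_times = [0.0, 0.1, 0.4, 0.5, 0.9, 1.0]
print([pacer.should_set_rf(t) for t in send_times])
```

[With this pacing, a route change produces at most one report per interval
from a given receiver, however many datagrams are in flight.]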
ACTION ITEMS
1. Noel Chiappa and Van Jacobson were assigned to try to get the IESG
to free up an IP header bit.
2. Mike Karels was going to think more about having routers send ICMPs
when they fragment.
3. We need to determine how many routers will drop packets with RF
set, and how hard it will be to fix this. Is it any different if
we use one of the bits in the TOS area?
4. Ditto for end-hosts; are there any that drop such packets?
5. The Router Requirements WG was known to be considering changing the
way that fragmentation was done (fragment into equal-size pieces;
currently, routers are supposed to send N maximal-size fragments
and one smaller one). This would make the RF scheme nearly
useless. [Phil Almquist says that the RRWG will work with us on
this, so it shouldn't be a problem].
6. Perhaps more traffic studies would be useful.
7. Someone has to write the next draft. Keith and Rich were thanked
for their hard work, on their draft that is now tabled, and were
not coerced into starting a different document. Since Van was the
fiercest proponent of RF at the meeting, he was given
responsibility to see to it that the draft is written. He agreed
but said he was going to try to get Steve Deering to do the work
(Steve was absent due to serious thesis time-pressure, so maybe Van
is going to be stuck with it.) The chair requested a draft within
one month (7 March 1990).
8. James VanBokkelen was going to see just how many hosts out there
are unable to reassemble fragmented IPs, how hard it would be to
fix this, how many vendors are involved, etc.
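[Illustration for item 5: the minutes do not say why equal-size
fragmentation would make the RF scheme nearly useless. One plausible
reading, assumed here, is that the sender infers the constricting MTU from
the size of fragment-0. The sketch compares the two fragmentation rules;
the function names are hypothetical and a 20-byte IP header without options
is assumed.]

```python
# Illustrative sketch only. Names are assumptions; sizes are payload
# bytes per fragment, which must be multiples of 8 except for the last.
import math

IP_HEADER = 20  # bytes, assuming no IP options

def max_payload(mtu):
    """Largest fragment payload that fits in the MTU, a multiple of 8."""
    return ((mtu - IP_HEADER) // 8) * 8

def split(payload, size):
    """Cut payload into pieces of the given size plus a remainder."""
    sizes = []
    while payload > size:
        sizes.append(size)
        payload -= size
    sizes.append(payload)
    return sizes

def fragment_maximal(payload, mtu):
    """Current rule: N maximal-size fragments and one smaller one."""
    return split(payload, max_payload(mtu))

def fragment_equal(payload, mtu):
    """Proposed rule: the same number of fragments, roughly equal sizes."""
    count = max(1, math.ceil(payload / max_payload(mtu)))
    return split(payload, math.ceil(payload / count / 8) * 8)

# A 1500-byte datagram (1480 payload bytes) crossing a 1006-byte link.
print(fragment_maximal(1480, 1006))
print(fragment_equal(1480, 1006))
```

[Under the current rule fragment-0 is 984 + 20 = 1004 bytes, within a few
bytes of the link MTU. Under the equal-size rule it is 744 + 20 = 764
bytes, which says little about how large a datagram the path will carry.]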
IESG ACTION
On Thursday, February 8, at the open IESG meeting, the IESG was asked to
allow this bit to be used for MTU discovery. I was not there, but I
understand that the IESG is willing to release this bit if we come to a
consensus on a protocol that they think is reasonable.
SCHEDULE
We expect to meet again at the May IETF meeting.
At that point, we will probably either adopt one of the schemes, or give
up.
ATTENDEES
Ballard Bare          bare%hprnd@hplabs.hp.com
Art Berggreen         art@sage.acc.com
Richard Bosch         probe@mit.edu
Ron Broersma          ron@nosc.mil
John Cavanaugh        John.Cavanaugh@StPaul.ncr.com
Noel Chiappa          jnc@LCS.MIT.EDU
James Davin           jrd@ptt.lcs.mit.edu
Farokh Deboo          sun!iruucp!ntrlink!fjd
Rich Fox              sytek!rfox@sun.com
Van Jacobson          van@lbl-csam.arpa
Mike Karels           karels@berkeley.edu
Mike Marcinkevicz     mdm@gumby.dsd.trw.com
Tony Mason            mason@transarc.com
Keith McCloghrie      sytek!kzm@hplabs.HP.COM
Bill Melohn           melohn@sun.com
Jeff Mogul            mogul@decwrl.dec.com
John Moy              jmoy@proteon.com
Drew Perkins          ddp@andrew.cmu.edu
Michael Petry         petry@trantor.umd.edu
Nuggehalli Pradeep    pradeep@orville.nas.nasa.gov
Mark Rosenstein       mar@athena.mit.edu
Tony Staw             staw@marvin.enet.dec.com
James VanBokkelen     jbvb@ftp.com
John Veizades         veizades@apple.com
Steve Willis          swillis@wellfleet.com
John Wobus            JMWobus@suvm.acs.syr.edu
David Zimmerman       dpz@convex.com